The fastest racetrack on the planet




Trillions of protons will race around the 27km ring in
opposite directions over 11,000 times a second, travelling
at 99.999999991 per cent the speed of light.



The emptiest space in the solar system... 





To accelerate protons to almost the speed of light requires a 
vacuum as empty as interplanetary space. There is 10 times 
more atmosphere on the moon than there will be in the LHC. 



One of the coldest places in the universe 
With an operating temperature of about -271 degrees
Celsius, just 1.9 degrees above absolute zero, the LHC is
colder than outer space.



The hottest spots in the galaxy... 





When two beams of protons collide, they will generate 
temperatures 1000 million times hotter than the heart of the 
sun, but in a minuscule space. 



The biggest, most sophisticated detectors ever built...




To sample and record the debris from up to 600 million 
proton collisions per second, scientists are building 
gargantuan devices that measure particles with micron 
precision. 



The most extensive computer system in the world 
To analyse the data, tens of thousands of computers around
the world are being harnessed in the Grid. The laboratory
that gave the world the web is now taking distributed
computing a big step further.



Why? 



To push back the frontiers of knowledge... 






Newton's unfinished business... what is mass? 

Science's little embarrassment... what is 96% of the Universe made of? 

Nature's favouritism... why is there no more antimatter? 

The secrets of the Big Bang... what was matter like within the first second of the 
Universe's life? 



To develop new technologies 





Information technology - the Web and the Grid 

Medicine - diagnosis and therapy 

Security - scanning technologies for harbours and airports 

Vacuum - new techniques for flat screen displays or solar energy devices 



To unite people from different countries and cultures 




20 Member states 

38 Countries with cooperation agreements 

111 Nationalities 

10000 People 



To train the scientists and engineers of tomorrow... 




From mini-Einstein workshops for five- to six-year-olds through to professional schools in
physics, accelerator science and IT, CERN plays a valuable role in building
enthusiasm for science and providing formal training.
ATLAS

- General purpose

[Detector diagram labels: Barrel Toroid, Inner Detector, Hadronic Calorimeters, Shielding, ACORDE, Absorber, Muon Filter, Trigger Chambers]

- Ion collisions: 50,000 particles in each collision
- To study the differences between matter and antimatter






The accelerator generates 40 million particle collisions (events) every
second at the centre of each of the four experiments' detectors,
reduced by online computers to a few hundred "good" events per second,
which are recorded on disk and magnetic tape at 100-1,000 MegaBytes/sec:
~15 PetaBytes per year.
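As a rough cross-check of the recording figures, the sketch below converts a yearly data volume into an average rate. It assumes decimal units (1 PetaByte = 10^15 bytes) and recording spread evenly over a 365-day year; neither assumption is stated on the slide, and the function name is purely illustrative.

```python
# Assumption: recording is averaged over a full 365-day calendar year.
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Average rate in MB/s for a yearly volume, using decimal units."""
    bytes_per_year = petabytes_per_year * 1e15
    return bytes_per_year / SECONDS_PER_YEAR / 1e6


rate = average_rate_mb_per_s(15)
print(f"15 PB/year is about {rate:.0f} MB/s averaged over the year")
```

Under these assumptions the yearly average falls inside the quoted 100-1,000 MegaBytes/sec recording range, which suggests the two figures are at least mutually consistent.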
[Diagram: Data Handling and Computation for Physics Analysis. Recoverable labels: analysis objects (extracted by physics topic); interactive physics analysis]
Summary of Computing Resource Requirements

All experiments - 2008
From LCG TDR - June 2005

                       CERN    All Tier-1s    All Tier-2s    Total
CPU (MSPECint2000s)                56             61
Disk (PetaBytes)                   31             19           57
Tape (PetaBytes)                   35                          53

[Pie chart labels: All Tier-2s 43%; All Tier-1s 35%. Annotations: 2,580 PCs; 1,500 boxes]
Tier-0 - the accelerator centre

_ Data acquisition & initial processing
_ Long-term data curation
_ Distribution of data to Tier-1 centres

Tier-1 centres

_ Canada - TRIUMF (Vancouver)
_ France - IN2P3 (Lyon)
_ Germany - Forschungszentrum Karlsruhe
_ Italy - CNAF (Bologna)
_ Netherlands - NIKHEF/SARA (Amsterdam)
_ Nordic countries - distributed Tier-1
_ Spain - PIC (Barcelona)
_ Taiwan - Academia Sinica (Taipei)
_ UK - CLRC (Oxford)
_ US - FermiLab (Illinois), Brookhaven (NY)

Tier-1 - "online" to the data acquisition process; high availability

_ Managed Mass Storage
_ Data-heavy analysis
_ National, regional support

Tier-2 - ~100 centres in ~40 countries

_ Simulation
_ End-user analysis - batch and interactive
Timely Technology!

Deploy to meet LHC computing needs.

Challenges for the Worldwide LHC Computing Grid Project due to

_ worldwide nature
• competing middleware...
_ newness of technology
• competing middleware...
_ scale




WLCG service relies on three Grid
infrastructures: EGEE, OSG and NorduGrid

Interoperability required (and achieved) for

_ users (job submission)
_ administration (identity, monitoring, accounting, ...)

A map of the worldwide LCG infrastructure operated by EGEE and OSG.
Interoperability in action

[Screenshot: grid job monitor. Statistics shown: Scheduled: 9464; Running: 12198; Done: 7319; Aborted: 3468; Cancelled: 93; Active Sites: 156:35203]
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Developed by e-Science s HEP 

Imperial College 
http://www.cio.com/article/135700/Seven_Wonders_of_the_IT_World/4



Seven Wonders of the IT World

The fastest supercomputer. The most intriguing data center. The constantly changing core at the heart of
Linux. Take a tour of the most impressive and most unusual marvels of the IT world.

By C.G. Lynch

World's largest scientific grid computing project:
The Enabling Grids for E-sciencE II (EGEE-II) project

Launched: September 2006, for use by scientists around the
world.







[Image caption: A Google Earth view of European sites hooked into the EGEE grid computing project]



Helps power: Large-scale 
scientific research projects in fields 
from geology to chemistry — for 
example, will analyze data from 
CERN's Large Hadron Collider, a 
particle accelerator being built to 
help investigate details around the 
Big Bang and related physics 
questions. 






Amount of work it does: 98,000 jobs a day, more than 1 
million per month. 

Juggling ability: Runs about 30,000 jobs concurrently, on
average.
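The two figures quoted above (jobs per day and jobs running concurrently) can be related through Little's law, L = λW, which gives a rough mean job duration. The sketch below is only an illustration: it assumes the grid is in steady state and that both figures are averages over the same period, which the article does not state.

```python
def mean_job_duration_hours(jobs_per_day: float, concurrent_jobs: float) -> float:
    """Little's law: concurrent = arrival_rate * duration, so duration = concurrent / rate."""
    duration_days = concurrent_jobs / jobs_per_day
    return duration_days * 24


hours = mean_job_duration_hours(jobs_per_day=98_000, concurrent_jobs=30_000)
print(f"Implied mean job duration: {hours:.1f} hours")
```

Under these assumptions the implied mean job length is a little over seven hours.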



530M SI2K-days/month (CPU) 



9 PB disk at CERN + Tier-1s 
Creating a working Grid service across
multiple infrastructures is clearly a success,
but challenges remain

_ Reliability
[Chart: Site Reliability, CERN + Tier-1s. Reliability viewed from the grid, measured by a set of standard jobs run hourly. Series: Average; Average - 8 best sites; Target. Vertical axis up to 100%]

[Chart: Site Reliability, Tier-2 Sites, Jan 2007 to Aug 2007. 83 Tier-2 sites being monitored. Series: All Sites; Best 50% of Sites; Best 20% of Sites]



Creating a working Grid service across
multiple infrastructures is clearly a success,
but challenges remain

_ Reliability
_ Ramp-up
[Chart: CERN + Tier-1s - Installed and Required DISK Capacity (PetaBytes), by month from April 2006. Series: installed; target]

Evolution of installed capacity from April 06 to June 07
Target capacity from MoU pledges for 2007 (due July 07)
and 2008 (due April 08)
Creating a working Grid service across
multiple infrastructures is clearly a success,
but challenges remain

_ Reliability
_ Ramp-up
_ Collaboration

• From computer centre empires to a federation

• consensus rather than control
[Diagram labels: Leaf - Logistical; Lemon - Performance & Exception Monitoring]

Toolkit developed by CERN in collaboration with
many HEP sites and as part of the European
DataGrid Project.
Commercial Management Suites

_ (Full) Linux support rare (5+ years ago...)

_ Much work needed to deal with specialist HEP
applications; insufficient reduction in staff costs
to justify license fees.

Open Source Systems

_ Many packages with interesting features, but
none featuring all of the items considered essential

• Declarative, hierarchical configuration specification
permitting validation, integrated software
distribution and configuration management,
separation of configuration data and code, feedback
loop to avoid configuration drift, ability to update ...

See the EDG/WP4 report (.../hep-proj-grid-fabric/Tools/DataGrid-04-TED-0101-3_0.pdf) or
"Framework for Managing Grid-enabled Large Scale Computing Fabrics"
(http://cern.ch/quattor/documentation/poznanski-phd.pdf) for reviews of various packages.
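Two of the requirements listed above, a declarative specification that permits validation and a feedback loop that detects configuration drift, can be sketched in a few lines. This is a generic illustration only: the schema, the function names and the dictionary representation are assumptions made for the example and are not taken from any of the reviewed packages.

```python
from typing import Any

# Hypothetical schema: configuration path -> expected Python type.
SCHEMA: dict[str, type] = {
    "/system/cluster/name": str,
    "/software/components/yaim/active": bool,
}


def validate(profile: dict[str, Any]) -> list[str]:
    """Return a list of validation errors (empty if the profile is valid)."""
    errors = []
    for path, expected in SCHEMA.items():
        if path not in profile:
            errors.append(f"missing {path}")
        elif not isinstance(profile[path], expected):
            errors.append(f"{path}: expected {expected.__name__}")
    return errors


def drift(desired: dict[str, Any], actual: dict[str, Any]) -> dict[str, tuple]:
    """Map each drifted path to its (desired, actual) pair of values."""
    return {
        path: (value, actual.get(path))
        for path, value in desired.items()
        if actual.get(path) != value
    }


desired = {"/system/cluster/name": "lxbatch", "/software/components/yaim/active": True}
actual = {"/system/cluster/name": "lxbatch", "/software/components/yaim/active": False}
print(validate(desired))       # no errors for a well-formed profile
print(drift(desired, actual))  # the one path that no longer matches
```

Keeping the desired state as plain data, separate from the code that checks and applies it, is what makes both validation before deployment and drift detection afterwards straightforward.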
[Architecture diagram labels: SW server(s), RPM, Configuration server, CDB, SQL backend, XML backend, XML configuration profiles, Install server, Install Manager, System installer, Managed Nodes (base OS, services, RPMs / PKGs)]

Used by 18 organisations besides CERN, including two
distributed implementations with 5 and 18 sites.
http://lemonweb.cern.ch/lemon-status/tpl_view.php?profile=pro_type_lxbatch_slc3
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Tony Cass 






Template info: pro_type_lxbatch_slc3 



17 Nov 2006 Fri 18:20:22 






######################################################
#
# template pro_type_lxbatch_slc3
#
# RESPONSIBLE: Thorsten Kleinwort
#
######################################################

template pro_type_lxbatch_slc3;

include pro_software_components_slc3;
include pro_system_lxbatch;
include pro_os_slc3;

"/system/cluster/tplname" = "pro_type_lxbatch_slc3";

#
# Yaim for gLite 3.0
#
include pro_software_components_lcg_yaim_3_0;

"/software/components/yaim/active" = true;
"/software/components/yaim/nodetype/glite-WN" = true;
"/software/components/yaim/configure" = true; # Do automatically configure YAIM
"/system/cluster/subname" = "public";
"/system/accounting/name" = "share";

#
# SPMA proxy configuration
#
# use head node as proxy server
"/software/components/spma/headnode" = true;
# active SPMA proxy
"/software/components/spma/proxy" = "yes";

"/software/components/lsfclient/lsftype" = lsftype();
[Diagram: server cluster with Backend, Frontend and L1 proxies serving installation images, RPMs and configuration profiles over DNS-load balanced HTTP]
Node based Lemon sensors cover all the usual system
parameters and more

_ system load, file system usage, network traffic, daemon count,
software version...

_ SMART monitoring for disks

_ Oracle monitoring

• number of logons, cursors, logical and physical I/O, user commits, index
usage, parse statistics, ...

_ AFS client monitoring

It is also possible to provide non-node sensors. At CERN
these allow integration of

_ information from the building management system

• Power demand, UPS status, temperature, ...

• and full feedback is possible (although not implemented): e.g. system shutdown on
power failure

_ high level mass-storage and batch system details

• Queue lengths, file lifetime on disk, ...

_ hardware reliability data
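The idea of a node sensor that samples system parameters and raises exceptions against thresholds can be sketched as follows. This is not the Lemon sensor API: the function names, metric names and threshold values are hypothetical, and the sampling function relies on `os.getloadavg`, which is only available on Unix-like systems.

```python
import os
import shutil


def sample_node_metrics(path: str = "/") -> dict[str, float]:
    """Sample two of the 'usual system parameters': load and file system usage."""
    load1, _, _ = os.getloadavg()  # Unix only
    usage = shutil.disk_usage(path)
    return {"load1": load1, "fs_used_fraction": usage.used / usage.total}


def exceptions(metrics: dict[str, float], thresholds: dict[str, float]) -> list[str]:
    """Names of metrics whose sampled value exceeds its threshold."""
    return sorted(
        name for name, limit in thresholds.items() if metrics.get(name, 0.0) > limit
    )


# Hypothetical thresholds, chosen only for the example.
THRESHOLDS = {"load1": 8.0, "fs_used_fraction": 0.9}

sampled = {"load1": 12.5, "fs_used_fraction": 0.42}  # stand-in for sample_node_metrics()
print(exceptions(sampled, THRESHOLDS))
```

Non-node sensors (power demand, queue lengths and so on) would fit the same pattern: anything that can produce a named numeric value can be checked against a threshold in the same way.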
[Charts: power distribution monitoring, last year (Dec to Oct). Panels: kW, kVA and A (neutral), in total and per phase]

Metric        aver      max       min       curr
kW            68.48     76.85     38.52     72.23
kVA           71.88     80.32     43.27     75.52
A neutral     40.58     45.50     30.52     45.50
kW1           22.85     25.82     12.44     23.37
kW2           18.53     22.50     11.18     18.82
kW3           26.10     28.40     15.88     28.03
kVA1          23.81     26.81     13.57     24.32
kVA2          20.57     23.58     12.33     20.72
kVA3          27.52     30.83     17.36     30.48
A1           105.12    117.76     58.53    106.71
A2            80.34    103.60     54.14     80.66
A3           120.58    135.31     75.83    133.83
[Screenshot: Lemon Monitoring Web Pages, https://lemonweb.cern.ch/lemon-status/info.php?vo=vo_atlas]

Vo info: vo_atlas
19 Nov 2006 Sun 12:23:26

Virtual Organization Information

# of hosts (down):      325 (14)
operating system(s):    2.4.21-47.EL.cernsmp, 2.6.9-42.0.2.EL.1.cernsmp
# of CPUs (down):       656 (26)
average up time:        52 days, 23h:03m (boots per host)
hosts down:             lxb1326, lxb0471, ...
exceptions:             ATD_WRONG, SPMA_ERROR, ...

Load Percentages (pie chart values): 44.9%, 10.8%, 12.3%, 4.3%, 27.7%
(legend categories: 0-0.5, 0.5-1.0, 1.0-2.0, > 2.0, down)

CPU utilization - last day

                aver         max          min          curr
User CPU        390.50m      9567.51m     110.89m      121.03m
System CPU      1179.57m     3015.01m     667.16m      1508.13m
Nice CPU        47280.61m    50101.48m    41057.70m    42885.84m
Idle CPU        50504.46m    56486.85m    38821.85m    54742.38m
IO Wait CPU     525.01m      4687.80m     224.10m      607.26m
IRQ CPU         18.06m       83.60m       14.88m       17.78m
Soft IRQ CPU    102.70m      152.51m      84.84m       107.53m

[Chart: Network utilization - last day (eth0 in, eth0 out)]
Introduction to CERN and Experiments

LHC Computing

Challenges

_ Capacity Provision
_ Box Management

• Installation & Configuration

• Monitoring

• Workflow

_ Data Management and Distribution
_ What's Going On?

Summary/Conclusion
LEAF is a collection of workflows for high level
node hardware and state management, on
top of Quattor and LEMON:

HMS (Hardware Management System):

_ Tracks systems through all physical steps in their lifecycle, e.g. installation, moves, vendor
calls, retirement

_ Automatically sends requests for installs, retirements etc. to technicians

_ GUI to locate equipment physically

_ HMS implementation is CERN specific, but concepts and design should be generic

SMS (State Management System):

_ Automated handling (and tracking) of high-level configuration steps

• Reconfigure and reboot all LXPLUS nodes for new kernel and/or physical move

• Drain and reconfigure nodes for diagnosis / repair operations

_ Issues all necessary (re)configuration commands via Quattor
_ Extensible framework: plug-ins for site-specific operations possible
[Workflow diagram involving Operations, technicians and NWDB. Numbered steps:]

1. Import
2. Set to standby
3. Update
4. Refresh Node
5. Take out of production
   - Close queues and drain jobs
   - Disable alarms
6. Shutdown work order
7. Request move
8. Update
9. Update
10. Install work order
11. Set to production
12. Update
13. Refresh
14. Put into production

Simple

_ Operator alarms masked according to system state

Complex

_ Disk and RAID failures detected on disk storage nodes lead
automatically to a reconfiguration of the mass storage system

[Diagram: the Lemon Agent on a Disk Server reports "RAID degraded"; LEMON raises an Alarm to the Alarm Monitor and Alarm Analyst; SMS issues "set Standby" and "set Draining" towards the Mass Storage System]

Draining: no new connections allowed; existing data transfers continue.
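The state handling described above (alarms masked according to state, and a draining state that refuses new connections) can be sketched as a small state machine. The state names come from the slides, but the allowed transitions, class names and method names below are assumptions made for illustration and do not describe the actual SMS implementation.

```python
from enum import Enum


class State(Enum):
    PRODUCTION = "production"
    DRAINING = "draining"
    STANDBY = "standby"


# Hypothetical transition table; the real SMS rules are not given on the slides.
ALLOWED = {
    State.PRODUCTION: {State.DRAINING, State.STANDBY},
    State.DRAINING: {State.STANDBY, State.PRODUCTION},
    State.STANDBY: {State.PRODUCTION},
}


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.PRODUCTION

    def set_state(self, new_state: State) -> None:
        """Move to new_state, rejecting transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.name}: {self.state.value} -> {new_state.value} not allowed")
        self.state = new_state

    def accepts_new_connections(self) -> bool:
        # Draining: no new connections allowed; existing transfers continue.
        return self.state is State.PRODUCTION

    def operator_alarms_visible(self) -> bool:
        # Operator alarms are masked unless the node is in production.
        return self.state is State.PRODUCTION


node = Node("diskserver01")  # hypothetical host name
node.set_state(State.DRAINING)
print(node.accepts_new_connections(), node.operator_alarms_visible())
```

Making the transition table explicit data is one way to keep site-specific rules out of the code, in the spirit of the plug-in approach mentioned for SMS.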
At CERN, the ELFms toolkit has allowed us
to cope with a significant increase in box
count with reduced staffing levels.

We have confidence the software will scale
further

_ although changes are needed (e.g. to cope with
virtualisation).

Large scale farm operation, though, remains
a challenge!

_ ramp-up, purchasing, h/w failures, ...

(even if we are not at the Google scale)
Introduction to CERN and Experiments

LHC Computing

Challenges

_ Capacity Provision

_ Box Management

_ Data Management and Distribution

_ What's Going On?

Summary/Conclusion
[Diagram: data flow rates for scheduled work only. Rates shown: 700MB/s; 700MB/s (1600MB/s); 420MB/s; 1120MB/s (2000MB/s)]

Scheduled work only!

Averages! Need to be able to
support 2x for recovery!

Remember this figure
15PB/year. Peak rate to tape >2GB/s

_ 3 full SL8500 robots/year

Requirement in first 5 years to reread all
past data between runs

_ 60PB in 4 months: 6GB/s

Can run drives at sustained 80MB/s

_ 75 drives flat out merely for controlled access

Data Volume has interesting impact on
choice of technology

_ Media use is advantageous: high-end technology
(3592, T10K) favoured over LTO.
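The reread requirement and the drive count quoted above follow from simple arithmetic, reproduced in the sketch below. It assumes decimal units (1 PB = 10^15 bytes) and 30-day months, which the slide does not specify; the function names are illustrative.

```python
import math

SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day months


def required_rate_gb_per_s(petabytes: float, months: float) -> float:
    """Sustained rate needed to read a volume within a given number of months."""
    return petabytes * 1e15 / (months * SECONDS_PER_MONTH) / 1e9


def drives_needed(rate_gb_per_s: float, drive_mb_per_s: float) -> int:
    """Number of drives running flat out to sustain the given rate."""
    return math.ceil(rate_gb_per_s * 1000 / drive_mb_per_s)


reread_rate = required_rate_gb_per_s(60, 4)
drives = drives_needed(6, 80)
print(f"60PB in 4 months needs about {reread_rate:.1f} GB/s")
print(f"6GB/s at 80MB/s per drive needs {drives} drives")
```

With these assumptions the computed rate comes out slightly below the slide's rounded 6GB/s, and the drive count at 6GB/s matches the quoted 75.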
Multiple use cases...

_ Sustained transfer to remote site

• WAN visibility; I/O intensive

_ Rapid transfer of data set to CPU node

• LAN access; I/O intensive

_ Long running analysis access to data on server

• LAN access, low I/O, long duration

...all have a different footprint on disk servers
Commercial Mass Storage Systems have been evaluated at
CERN, but with little success. Key systems evaluated were

_ Lachmann/Legent OSM

• Still in use at DESY, but interest at CERN much reduced due to
lack of long-term support (DESY provide their own support)

_ IBM's HPSS

• In use at SLAC, BNL (US labs) and IN2P3 (French Computer
Centre)

• Experience at CERN showed random access to files (a major use
case) was poor; addressing this required additional software and
disk buffers

• At the time, HPSS also required a DCE infrastructure and had
limited O/S & hardware support.

_ IEEE "vision" of companies providing pluggable
components of an overall system didn't work out in practice;
we ended up with single vendors providing all the
components...

... and so CERN developments became more and more
capable, leading to Castor: CERN Advanced Storage
System
Database Centric

_ Stateless agents; can restart easily on error

_ No direct connection from users to critical
services

Scheduled Access to I/O

_ No overloading of disk servers

• Per-server limit set according to type of transfer

• Servers can support many random access style accesses
but only a few sustained data transfers

_ I/O requests can be scheduled according to
priority

• Fair shares access to I/O just as for CPU

• Prioritise requests from privileged users
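Scheduled access to I/O with per-server limits by transfer type and priority ordering can be sketched with a priority queue. The sketch below is a simplified illustration, not Castor's scheduler: the slot limits, class names and the convention that a lower number means higher priority are all assumptions, and fair-share accounting is deliberately left out.

```python
import heapq
from itertools import count

# Hypothetical per-server slot limits: many random accesses, few sustained transfers.
LIMITS = {"random": 50, "sustained": 3}


class DiskServer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.active = {kind: 0 for kind in LIMITS}

    def has_slot(self, kind: str) -> bool:
        return self.active[kind] < LIMITS[kind]


class Scheduler:
    def __init__(self, servers: list[DiskServer]) -> None:
        self.servers = servers
        self.queue: list[tuple[int, int, str, str]] = []
        self._seq = count()  # tie-breaker keeps equal priorities in arrival order

    def submit(self, user: str, kind: str, priority: int = 10) -> None:
        """Queue a request; lower priority numbers are served first."""
        heapq.heappush(self.queue, (priority, next(self._seq), user, kind))

    def dispatch(self) -> list[tuple[str, str]]:
        """Start every queued request that fits; return (user, server) pairs."""
        started, waiting = [], []
        while self.queue:
            item = heapq.heappop(self.queue)
            _, _, user, kind = item
            server = next((s for s in self.servers if s.has_slot(kind)), None)
            if server is None:
                waiting.append(item)  # no free slot of this type: stays queued
            else:
                server.active[kind] += 1
                started.append((user, server.name))
        for item in waiting:
            heapq.heappush(self.queue, item)
        return started


sched = Scheduler([DiskServer("ds1")])
sched.submit("alice", "sustained", priority=5)
sched.submit("bob", "sustained", priority=1)  # privileged user
sched.submit("carol", "sustained", priority=5)
sched.submit("dave", "sustained", priority=9)
started = sched.dispatch()
print(started)
```

Because requests wait in the queue rather than hitting the disk server directly, a server is never asked to run more sustained transfers than its limit allows.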
[Diagram: detailed view. Labels: Scheduler, DB Svc, Stager, Job Svc, Qry Svc, Error Svc]
[Screenshot: Lemon cluster page]

Cluster info: c2sc4 subcluster wan
19 Apr 2006 Wed 16:14:27

Cluster Information

# of hosts (down):      43 (1)
operating system(s):    2.6.9-34.EL.cernsmp
# of CPUs (down):       51 (2)
average up time:        20 days, 1?h:50m (boots per host)
hosts down:             lxfsra3004
exceptions:             RPC_STATD_WRONG, MIRROR_BROKEN, ...

Load Percentages (pie chart values): 23.3%, 16.3%, 14.0%, 2.3%, 44.2%
(legend categories: 0-0.5, 0.5-1.0, 1.0-2.0, > 2.0, down)

CPU utilization - last day

              aver     max      min      curr
User CPU      8.12     10.87    5.82     10.47
System CPU    58.16    75.51    28.11    46.28
Nice CPU      1.56     1.86     1.01     1.64
Idle CPU      32.16    64.50    14.17    41.60

Network utilization - last day

            aver        max         min         curr
eth0 in     684.15M     1363.16M    35.04M      53.25M
eth0 out    1336.65M    1601.45M    1008.61M    1541.81M

Sustained transfer from disk of 1.2GB/s as data import ramps
[Screenshot: Lemon cluster page. Panels: Load Percentages; CPU utilization - last month; Network utilization - last month (09-Dec to 29-Dec)]

Cluster info: ITDC
03 Jan 2006 Tue 10:19:12

Cluster Information

# of hosts (down):      46 (0)
operating system(s):    2.4.21-37.EL.cernsmp
# of CPUs (down):       62 (0)
average up time:        47 days, 14h:14m (boots per host)
hosts down:             none
exceptions:             FILESYSTEM ERROR

Sustained transfer of incoming data to tape at 1GB/s
Note the dates! Failed hardware was left down.
[Screenshot: Lemon cluster page. Panels: Load Percentages; CPU utilization - last month; Network utilization - last month (09-Feb to 01-Mar)]

Cluster info: castor2 subcluster ITDC
06 Mar 2006 Mon 03:15:45

Cluster Information

# of hosts (down):      43 (0)
operating system(s):    2.4.21-37.EL.cernsmp, 2.4.21-37.0.1.EL.cernsmp
# of CPUs (down):       33 (0)
average up time:        33 days, 7h:43m (boots per host)
hosts down:             none

Network utilization - last month

            aver       max         min      curr
eth0 in     553.08M    2353.07M    0.05M    718.46M
eth0 out    463.00M    2318.73M    0.01M    718.81M

Peak transfer of incoming data to tape at over 2GB/s
[Chart: Network utilization, Week 36 to Week 42; vertical axis 1 G to 4 G]

            aver    max     min       curr
eth0 in     1.6G    2.9G    373.6M    1.2G
eth0 out    1.8G    4.0G    511.8M    1.5G
eth1 in     0.0     0.0     0.0       0.0
eth1 out    0.0     0.0     0.0       0.0
Long data lifetime

Disk capacity vs I/O rates 

File sizes 

Multiple Mass Storage Systems 

Organised Data Export 
LEP, CERN's last accelerator, started in
1989 and shut down 11 years later.

_ First data recorded to IBM 3480s; at least 4
different technologies used over the period.

All data ever taken, right back to 1989, was
reprocessed and reanalysed in 2001/2.

LHC starts in 2007 and will run until at least
2020.

_ What technologies will be in use in 2022 for the
final LHC reprocessing and reanalysis?

Data repacking required every 2-3 years.

_ Time consuming

_ Data integrity must be maintained
                        1996           2000          2006
Disk capacity           4GB            50GB          500GB
Disk transfer rate      10MB/s         20MB/s        60MB/s
I/O (disks x rate)      250x10MB/s     20x20MB/s     2x60MB/s
Aggregate I/O           2,500MB/s      400MB/s       120MB/s

CERN now purchases two different storage server models:
capacity oriented and throughput oriented.

• fragmentation increases management complexity

• (purchase overhead also increased...)
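The aggregate I/O figures can be reproduced by asking how many disks of each generation are needed to hold one terabyte and multiplying by the per-disk rate. That "per terabyte" reading is an interpretation of the slide (it is the only one under which 250, 20 and 2 disks make sense), and decimal units (1 TB = 1000 GB) are assumed.

```python
# year -> (disk capacity in GB, per-disk transfer rate in MB/s), as quoted on the slide
DISKS = {1996: (4, 10), 2000: (50, 20), 2006: (500, 60)}


def aggregate_io_per_tb(capacity_gb: int, rate_mb_s: int) -> tuple[int, int]:
    """Return (disks needed for 1 TB, aggregate MB/s across those disks)."""
    disks = 1000 // capacity_gb  # assumption: 1 TB = 1000 GB
    return disks, disks * rate_mb_s


for year, (capacity_gb, rate_mb_s) in DISKS.items():
    disks, total = aggregate_io_per_tb(capacity_gb, rate_mb_s)
    print(f"{year}: {disks} disks x {rate_mb_s}MB/s = {total}MB/s per TB")
```

Capacity per disk grew far faster than transfer rate, so the I/O available per stored terabyte fell sharply, which is the motivation for buying separate capacity oriented and throughput oriented servers.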
[Chart: Tape Drive Efficiency (30 MB/s max performance). Curves for three, ten, fifty and a thousand files per mount. Annotations: tape mount time ~120s; file overhead ~5s; 400MB]
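A simple model shows why the number of files read per tape mount matters so much for drive efficiency. The sketch below is an assumed model, not the one used to produce the chart: it treats efficiency as the fraction of elapsed time spent actually transferring data, and its default parameters (30 MB/s drive rate, 120s mount time, 5s per-file overhead) are one reading of the chart labels.

```python
def tape_efficiency(
    files_per_mount: int,
    file_size_mb: float,
    drive_rate_mb_s: float = 30.0,     # assumed maximum drive performance
    mount_time_s: float = 120.0,       # assumed tape mount time
    per_file_overhead_s: float = 5.0,  # assumed positioning overhead per file
) -> float:
    """Fraction of the elapsed time spent transferring data."""
    transfer_s = files_per_mount * file_size_mb / drive_rate_mb_s
    total_s = mount_time_s + files_per_mount * per_file_overhead_s + transfer_s
    return transfer_s / total_s


for n in (3, 10, 50, 1000):
    print(f"{n:>4} files per mount of 400MB: efficiency {tape_efficiency(n, 400):.2f}")
```

In this model the mount time is amortised over more files as the count rises, while the per-file overhead sets an upper bound that only larger files can lift.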
CASTOR is not the sole MSS for LHC

_ Fermilab's dCache is used at many sites; DPM,
a disk-only storage manager, is also common.

Users, of course, don't want to know...

_ ... and experiment code needs to run at many
sites

_ SRM, the Storage Resource Manager, provides a
common interface layer to the various mass
storage systems

• See http://sdm.lbl.gov/srm-wg/

_ These multiple and independent
implementations of the interface all talk to each
other.

Key element to have successful
LHC experiments need to ship data between
sites

_ Raw data export, Analysis Data updates, Monte
Carlo data import, ...

This is complicated at our scale

_ with petabytes of data transferred, a 0.1%
failure rate can't be easily rescued or followed
up manually

_ Site policies (e.g. fraction of resources
allocated to a given VO) must be respected
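To see why a 0.1% failure rate cannot be followed up by hand, it helps to count the failed files. The sketch below does so for one petabyte; the 1 GB file size is purely an assumption for illustration, since the slides do not give typical file sizes.

```python
def expected_failures(total_bytes: float, file_size_bytes: float, failure_rate: float) -> float:
    """Expected number of failed file transfers for a given volume and failure rate."""
    return total_bytes / file_size_bytes * failure_rate


# Assumption: 1 PB shipped as 1 GB files (decimal units).
failures = expected_failures(total_bytes=1e15, file_size_bytes=1e9, failure_rate=0.001)
print(f"About {failures:.0f} failed transfers per petabyte")
```

Under this assumption every petabyte leaves roughly a thousand transfers to retry, so retries and bookkeeping have to be automated by the transfer service.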
Developed as part of the EGEE data
management activity to meet requirements

[Chart (GRIDVIEW): Averaged Throughput From 01/01/07 To 13/11/07, VO-wise Data Transfer From All Sites To All Sites. Vertical axis up to 1600; horizontal axis: Date (dd/mm). Legend: Alice, CMS, LHCb, OTHERS, UNREGD VOs]

_ (But no management on the network level)

Prevent storage overload
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CERN 




[Figure: GÉANT2 backbone topology, November 2006. Legend of link speeds: 10 Gbps, 2.5 Gbps, 622 Mbps, 310 Mbps, 155 Mbps, 34/45 Mbps. "Dark fibre" links provide multiple wavelengths; links also have backup IP connections.]

GÉANT2 is operated by DANTE on behalf of Europe's NRENs
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Successes: 

_We have an advanced Mass Storage System at
CERN able to meet the demanding requirements
for Data Acquisition and export.

_Large scale data transfers between sites are
becoming routine.

But we have yet to demonstrate

_exports for multiple experiments simultaneously

_operations for large scale user analysis

_most work so far has been controlled "production"
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Introduction to CERN and Experiments 

LHC Computing 

Challenges 

_ Capacity Provision 

_Box Management 

_Data Management and Distribution 

_What's Going On?

Summary/Conclusion 
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Site managers understand systems (we 
hope!). 

But do they understand the service? 

_and do the users? 

_and what about cross-site issues?

• Are things working?

• If not, just where is the problem?

_how many different software components, systems and
network service providers are involved in a data transfer
from site X to site Y?
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Home Dependencies 


[Screenshot: SLS "IT Services" overview, Wed 4 Oct 2006. Each service group is shown with an availability percentage and a status: Administrative applications, Services for physics, Infrastructure services, Worldwide LHC Computing Grid (shown as degraded), For developers and engineers, Databases, Networking. A detail panel for one group shows percentage 82%, status available, and lists the services it consists of: Castor2, Batch service, LXPLUS, LXBUILD, LXGATE. A tooltip for the CERN LXGATE Facility (LXGATE) reports availability 92%, with 2 out of 24 tested nodes not reachable. A graph of availability over the last 7 days, with weekly statistics, is also shown.]
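
The LXGATE tooltip in this screenshot reports 2 out of 24 tested nodes as not reachable. A minimal sketch of how such a figure could be turned into a percentage, assuming availability is simply the share of successful checks (the slides do not state the formula SLS actually uses):

```python
def availability_percent(passed, total):
    """Share of successful checks, rounded to a whole percentage."""
    if total <= 0:
        raise ValueError("at least one check is required")
    return round(100 * passed / total)


# 24 nodes tested, 2 not reachable
print(availability_percent(24 - 2, 24))
```

A per-service number like this is only the lowest layer; how the availabilities of subservices are combined into a figure for a composite service is a separate design choice that this sketch does not cover.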
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Source data 
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M eta services 



[Diagram: SLS service dependency graph. Nodes include IT Services, Engineering Services, Windows Services, Administrative Services, Services for Physics, CVS Services (Central CVS Service, CVS Service for LCG), Services for Experiments, Services for ATLAS, J2EE Public Service, CERN AFS Service with the AFS services for ATLAS, ALICE, CMS and LHCb, and the CASTOR Tape Service. Edges are labelled "depends on", "uses" and "subservice".]
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Service Level Status overview 
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Home I Dependencies 1 Adnnin 




[Screenshot: SLS page "Services used by LHCb people", Sun 19 Nov 2006, 12:53:09. Status: available. The service consists of Services for physics (LXPLUS, Batch service, AFS, AFS Service for LHCb, LXBUILD (LHCb), Castor2LHCb, CASTOR Tape Service) and Mail and Web Services (Indico, TWiki, EDH system). A tooltip for the Electronic Document Handling system (EDH) reports 100% available, with 23 SiteScope test(s) out of 23 succeeded. A graph shows availability in the last 24 hours. Additional information: VO LHCb; email Helpdesk@cern.ch; web site http://lhcb.cern.ch. The availability is refreshed every 2 minutes and expires after 60 minutes. SLS by CERN IT/FIO; contact SLS.Support@cern.ch.]



Domain                              Monitoring                  Tools in use
Grid Applications                   Application monitoring      Experiment Dashboards
Grid Middleware                     Grid Services monitoring    GStat, SAM/GridView, GridICE,
(central services, site services)                               GridPP Real Time Monitor
Local resources (site)              Local monitoring            Lemon/SLS, Nagios, Ganglia
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App 
Layer 



[Diagram: monitoring data flows between layers. Application layer: one Experiment Dashboard per experiment/VO (e.g. ATLAS) showing VO jobs, data and site reliability, plus a real time 3D job view. Grid layer: GOCDB (site registry), SAM and GridView (results on data transfer, job status and service availability; site status and graphs published as HTML), BDII and GridICE (fabric/job information), and other monitoring tools, exchanging data by HTTP/XML pull. Fabric layer: batch resources, CPUs and TBs, with Lemon providing fabric information.]
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Grid monitoring data Is complex! 

_And there are many sites...

Current tools visualize data by sorted tables,
bar charts, etc.

Difficult to present an easy-to-understand
top-level view which provides

_quick, action-oriented oversight and insight

_help in understanding job failures and availability
patterns

Can new visualizations help? 
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Idea 



visualize the Grid by using Treemaps 
(Grid + Treemap = GridMap) 



Example GridMap: each rectangle is a site, and sites are grouped into regions (CERN, UKI, France, NorthernEurope, GermanySwitzerland, Italy, SouthEasternEurope, CentralEurope, AsiaPacific, SouthWesternEurope, Russia).

Size of rectangle is e.g.

- size of site (#CPUs)

- #running jobs
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Idea 



visualize the Grid by using Treemaps 
(Grid + Treemap = GridMap) 



Example GridMap, with colour legend: ok, degraded, down.

Colour of rectangle is e.g.

- SAM status of site / service

- Availability of site / service
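
The size and colour mapping described above can be sketched with a very simple treemap layout. This uses a plain slice-and-dice split chosen for brevity, which is not necessarily the algorithm GridMap uses; the site names, CPU counts and colour names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    label: str
    x: float
    y: float
    w: float
    h: float
    colour: str


# Assumed colour coding for the three states named in the legend.
STATUS_COLOUR = {"ok": "green", "degraded": "orange", "down": "red"}


def slice_layout(items, x, y, w, h, horizontal):
    """Split one rectangle along one axis, proportionally to each weight.

    items is a list of (label, weight, status) tuples.
    """
    total = sum(weight for _, weight, _ in items)
    rects, offset = [], 0.0
    for label, weight, status in items:
        share = weight / total
        colour = STATUS_COLOUR[status]
        if horizontal:
            rects.append(Rect(label, x + offset, y, w * share, h, colour))
            offset += w * share
        else:
            rects.append(Rect(label, x, y + offset, w, h * share, colour))
            offset += h * share
    return rects


def gridmap(regions, width=100.0, height=100.0):
    """Regions side by side; sites stacked vertically inside their region."""
    region_items = [(name, sum(cpus for _, cpus, _ in sites), "ok")
                    for name, sites in regions.items()]
    region_rects = slice_layout(region_items, 0.0, 0.0, width, height, True)
    site_rects = []
    for region, sites in zip(region_rects, regions.values()):
        site_rects += slice_layout(sites, region.x, region.y,
                                   region.w, region.h, False)
    return site_rects


# Hypothetical data: (site, #CPUs, status)
example = {
    "CERN": [("SITE-A", 6000, "ok")],
    "UKI": [("SITE-B", 3000, "degraded"), ("SITE-C", 1000, "down")],
}
for r in gridmap(example):
    print(f"{r.label}: {r.w:.0f} x {r.h:.0f}, {r.colour}")
```

Because every split is proportional to the weights, each site's area is proportional to its CPU count, so a large site that is down dominates the picture in a way that a row in a sorted table does not.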
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GridMaps can be used for top-level, geographical and VO 



views

[Diagram: hierarchy of GridMaps for a large-scale federated grid services infrastructure. Top-level view: Global GridMap. Next level of GridMaps: geographical views (Local GridMaps per federation, partner, site, etc.) and cross-location VO views (Application Domain GridMap).]
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Trends can be understood by looking at a sequence of 
GridMaps 

Site Availability over time: 




[Figure: sequence of GridMaps showing site availability on successive days in September 2007 (panel dates include 20, 23, 24 and 25 Sep 2007). Labelled sites include the Tier0, IN2P3-CC, NDGF-T1, RAL-LCG2, TRIUMF-LCG2 and DESY-HH. Colour scale: 0%, 25%, 50%, 75%, 100%.]



Correlations of metrics can be discovered by switching 
between different views 



Site Availability from different VO perspectives:

[Figure: one GridMap per VO: OPS, Alice, Atlas, CMS, LHCb. Sites without colour do not support the VO.]

Status of different Site Services:

[Figure: one GridMap per service view: Overall Site, CE, SRM, site BDII.]

Legends: availability scale 0%, 25%, 50%, 75%, 100%; status Down, Degraded, Ok.
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Ratio of CPU : Wall clock Times 
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0% 
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month (2007) 



Introduction to CERN and Experiments 

LHC Computing 

Challenges 

_ Capacity Provision 

_Box Management 

_Data Management and Distribution 

_What's Going On?

Summary/Conclusion 
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mmense Challenges & Complexity 

_ Data rates, developing software, lack of standards,
worldwide collaboration, ...

Considerable Progress in last ~5-6 years

_ WLCG service exists

_ Petabytes of data transferred

But real data is nearly here...

_ Can the system cope with chaotic analysis?

_ Do we understand the system enough to identify
problems, and fix underlying causes?

Major "Dress Rehearsals" in Feb & May 2008

_ last chance to shake system down before operation

Answer(s) at LISA 08?
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